AITopics | rate distribution

Collaborating Authors

rate distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Category-Aware Semantic Caching for Heterogeneous LLM Workloads

Wang, Chen, Liu, Xunzhuo, Zhu, Yue, Youssef, Alaa, Nagpurkar, Priya, Chen, Huamin

arXiv.org Artificial IntelligenceNov-3-2025

LLM serving systems process heterogeneous query workloads where different categories exhibit different characteristics. Code queries cluster densely in embedding space while conversational queries distribute sparsely. Content staleness varies from minutes (stock data) to months (code patterns). Query repetition patterns range from power-law (code) to uniform (conversation), producing long tail cache hit rate distributions: high-repetition categories achieve 40-60% hit rates while low-repetition or volatile categories achieve 5-15% hit rates. Vector databases must exclude the long tail because remote search costs (30ms) require 15--20% hit rates to break even, leaving 20-30% of production traffic uncached. Uniform cache policies compound this problem: fixed thresholds cause false positives in dense spaces and miss valid paraphrases in sparse spaces; fixed TTLs waste memory or serve stale data. This paper presents category-aware semantic caching where similarity thresholds, TTLs, and quotas vary by query category. We present a hybrid architecture separating in-memory HNSW search from external document storage, reducing miss cost from 30ms to 2ms. This reduction makes low-hit-rate categories economically viable (break-even at 3-5% versus 15-20%), enabling cache coverage across the entire workload distribution. Adaptive load-based policies extend this framework to respond to downstream model load, dynamically adjusting thresholds and TTLs to reduce traffic to overloaded models by 9-17% in theoretical projections.

category, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.26835

Genre: Research Report (0.40)

Industry:

Health & Medicine (0.68)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)

Add feedback

Intelligent Learning Rate Distribution to reduce Catastrophic Forgetting in Transformers

Kenneweg, Philip, Schulz, Alexander, Schröder, Sarah, Hammer, Barbara

arXiv.org Artificial IntelligenceMar-27-2024

Pretraining language models on large text corpora is a common practice in natural language processing. Fine-tuning of these models is then performed to achieve the best results on a variety of tasks. In this paper, we investigate the problem of catastrophic forgetting in transformer neural networks and question the common practice of fine-tuning with a flat learning rate for the entire network in this context. We perform a hyperparameter optimization process to find learning rate distributions that are better than a flat learning rate. We combine the learning rate distributions thus found and show that they generalize to better performance with respect to the problem of catastrophic forgetting.

dataset, learning rate, rate distribution, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-21753-1_25

2404.01317

Country: Europe > Germany (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Selectivity and Metaplasticity in a Unified Calcium-Dependent Model

Yeung, Luk Chong, Blais, Brian S., Cooper, Leon N., Shouval, Harel Z.

Neural Information Processing SystemsDec-31-2003

A unified, biophysically motivated Calcium-Dependent Learning model has been shown to account for various rate-based and spike time-dependent paradigms for inducing synaptic plasticity. Here, we investigate the properties of this model for a multi-synapse neuron that receives inputs with different spike-train statistics. In addition, we present a physiological form of metaplasticity, an activity-driven regulation mechanism, that is essential for the robustness of the model.

plasticity, synapse, synaptic plasticity, (15 more...)

Neural Information Processing Systems

Country: